I am still some way (though not that far) from a completed manuscript for my thesis, so here is a collection of thoughts that give a general sense of where I'm heading.
One way to quantify how well prefetchers perform is to compare existing hardware prefetchers against some magical oracle prefetcher. You can do this by analyzing the specific prefetch streams that each of them generates.
When you compare these streams, you can generally separate the prefetches into four sets:
Set 1: prefetches made by hardware prefetchers that aren't made by the oracle. These aren't necessarily incorrect, but are deemed non-optimal by the oracle.
Set 2: prefetches made by both. These are simply optimal prefetches and aren't very interesting.
Set 3: prefetches made by the oracle that are theoretically possible for a hardware prefetcher to make. We just haven't designed one good enough to make them yet.
Set 4: prefetches made by the oracle that are impossible for a hardware prefetcher to make, likely because the oracle uses future information to make them, which hardware can't do because Intel doesn't know how to build a time machine.
It is easy to separate sets 1, 2 and 3+4 from each other, but it's much harder to separate set 3 from set 4. Set 3 is the one we really care about, since it contains the hard-to-make prefetches that can guide future prefetcher design.
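To make the easy part concrete, here is a minimal Python sketch of splitting two prefetch streams into set 1, set 2 and the combined set 3+4. It assumes a prefetch is identified purely by its cache-line address and ignores timing, which a real comparison would probably need to account for. The function name `partition_prefetches` and the example addresses are made up for illustration. Note that nothing here can tell set 3 apart from set 4; that is the hard part.

```python
def partition_prefetches(hardware, oracle):
    """Split two prefetch streams into the easily separable sets.

    Assumption: each stream is an iterable of cache-line addresses, and
    two prefetches count as "the same" if the addresses match.
    """
    hardware = set(hardware)
    oracle = set(oracle)
    return {
        # Set 1: issued by the hardware prefetcher but not by the oracle.
        "set1_hardware_only": hardware - oracle,
        # Set 2: issued by both.
        "set2_both": hardware & oracle,
        # Sets 3 and 4 together: issued only by the oracle. Set difference
        # alone cannot say which of these are achievable in hardware.
        "set3_and_4_oracle_only": oracle - hardware,
    }


# Hypothetical example streams (addresses invented for illustration).
hw_stream = [0x1000, 0x1040, 0x1080, 0x2000]
oracle_stream = [0x1040, 0x1080, 0x3000, 0x3040]

parts = partition_prefetches(hw_stream, oracle_stream)
for name, addrs in parts.items():
    print(name, sorted(hex(a) for a in addrs))
```

Using plain sets keeps the sketch short, but it also throws away ordering and duplicates, so it only answers "was this line ever prefetched", not "was it prefetched in time".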